System and method for uniformly administering parameters for a load distribution and load control of a computer platform

ABSTRACT

A system and method for uniformly administering parameters for a load distribution and load control of a computer platform includes a processor and a storage mechanism. Software components run on the processor, and include a manager component, which is a load model manager, that uniformly administers parameters for a load distribution and load control of the system. The platform further includes a catalog stored in the storage mechanism via which the load model manager administers the parameters. The catalog includes a plurality of tables. Each of the tables respectively includes a load model which is a complete, consistent set of parameters that influence the load distribution and the load control of the computer system.

BACKGROUND OF THE INVENTION

Field of the Invention

The invention relates to a platform and method of using this platform of a computer system for managing the load control and load distribution within the system.

The document European Patent EP-A-0 346 039 discloses a computer system that administers parameters for a load distribution and load control of a computer system.

A switching node is a computer system that processes jobs in a context of delay times that are demanded by the customer and by international standards. For broadband switching nodes, these jobs lie in very different areas, with different delay time demands, and deal with traffic volume of a very different type that fluctuates over time. In addition to coming from switching technology for data and voice traffic, these jobs come from task areas like protocol processing (for example, Common Channel Signaling System No. 7 (CCS-7) signalling, broadband User Network Interface (UNI) signalling (broadband subscriber signalling), broadband Network Network Interface (NNI) signalling (“variable-bandwidth trunk” signalling), Global Title Translation, Private Network Network Interface (PNNI) protocol, Public Access Interface, Private Branch Exchange Access, etc.), management of the signalling networks (particularly for network outages), operation of the operator interface, billing, traffic statistics, administration of subscriber lines, line circuits, routing data and tariff zones, as well as from error handling, which assures that the “down time” (time during which the entire switching node or sub-components are not available) is minimized.

Since this computer deals with network connections, there are both normal and other operating conditions (e.g., initialization, in which a network-wide communication is triggered) that must proceed in an optimally timely manner. The delay demands thus vary with the operating conditions, requiring an adequate reaction.

The real-time behavior needed from the switching node makes specific demands of the node architecture and of the operating system. To provide such real-time behavior, the processor run time capacities must be distributed as well as possible onto the various application programs at all times.

In the new Asynchronous Transfer Mode (ATM) switching node, time budget values are made available to the operating system in order to guarantee this, so that the OS can guarantee a minimum access time to the processor for the various applications.

Since telecommunication networks are subject to constant change with respect to performance features, architecture and size, the switching nodes in telecommunication networks must be flexible in terms of the functionalities that they offer and must be scalable in size.

In particular, a switching computer must be able to grow with minimal or no interruptions in operation. This includes a HW upgrade for increasing the processor performance that already exists, a SW upgrade for upgrading performance features, and (in conjunction with these upgrades under certain circumstances) a redistribution of the time budget at the processor, including other parameters (e.g., load thresholds for the load control, scan intervals, and protocol budgets). These parameters should be capable of being changed interruption-free.

Up to now, these parameters were distributed among the various users, i.e., the parameters were declared in the application programs. The interface administration with respect to the parameters was likewise performed by the users.

The load control parameters and load distribution parameters (“load control parameters”) together number roughly in the four-digit range and are fundamentally important for the load distribution (time division onto the various applications according to the throughput and delay time demands, with many but not all of these being used for scheduling) and load control. The previous distributed deposit of these load control parameters at the various users and their administration by them (via a corresponding plurality of user interfaces) would lead to a considerable, if not insoluble, engineering problem. This problem would be in implementing a situation-suited adaptation (in the configuration and the run time) of the currently valid parameter values for the purpose of optimum load distribution and load control.

SUMMARY OF THE INVENTION

The present invention is based on the object of solving the above problem.

The solution is achieved by a load model manager that uniformly administers the parameters that influence the load control and the load distribution of the system and, thus, the performance of the system in a “load model catalog”.

BRIEF DESCRIPTION OF THE DRAWINGS

An exemplary embodiment of the invention is described in greater detail below on the basis of the drawings.

FIG. 1 is a schematic illustration of the invention using an exemplary ATM node;

FIG. 2 is a task sequence diagram showing process scheduling;

FIG. 3 is a block diagram showing the relationship of the load model manager for various components within the system; and

FIG. 4 is a data structure diagram illustrating the load model catalog.

The architecture utilizing an exemplary ATM node is presented first with reference to FIG. 1.

The most important control of the broadband node is implemented in a scalable cluster of main processors. All processors communicate with a shared, internal protocol via an ATM switching network (ATM fabric). An individual main processor with its share of the overall software of the system is referred to below as a platform.

The cluster under consideration in this example can comprise 128 individual platforms in its maximum configuration. Hardware and basic software (for example, operating system and load control software) are the same for all platforms. That is, this base configuration must correspond to the demands of all known usage types/applications. At the same time, it must also be so comprehensive that a demand for new uses (potentially unknown at the present time) can be efficiently accommodated at any time. These usage types can be realized by various application software packages on the platforms. There are platforms for:

-   Call processing (can be multiply present in the node)
-   Protocol conversions (can be multiply present in the node)
-   Operation, administration and maintenance (can be multiply present in the node)
-   Management of the network signaling (is present exactly once in the node)
-   Operator interface (is present exactly once in the node).

The totality of software present on a platform for the above tasks, including basic software, is also referred to as “load type”. Beyond these “pure” load types, platforms having “combined load types” must also be offered so that minimal configurations can be offered in a customized and cost-effective manner. It is precisely these platforms that are a challenge for the careful matching of the guaranteed minimal running time budget of the applications, since the competition, particularly of different types of user programs, for the allocation of the processor run time is greatest in this situation.

The operating system and the load control are explained in greater detail below.

Process Term and CHILL

The applications discussed above are realized as processes and are also referenced as such below. A run time-related unit is understood here as a process. A process is a runnable software unit that has attributes; those that influence its access right to the processor are of primary interest. These attributes are statically implemented. The process with its attributes is described linguistically in CHILL (the High Level Language of the Comité Consultatif International Téléphonique et Télégraphique, CCITT) and is compiler-supported. Upon system start-up, the processes are offered in a form suited to the operating system that takes their attributes into consideration.

The two attributes of the processes important for the real-time behavior determine:

-   1) the affiliation to a task group, and
-   2) the affiliation to a category of delay time demands within this task group.

The task groups are those units for which a minimum access time to the processor of the platform on which they reside is guaranteed.

The category of the delay time demands of a process within a task group determines the way that processes that belong to the same task group on the same platform compete for calculating time when they are simultaneously ready to run. This attribute, which distinguishes the processes within a task group, is what is referred to as the process priority (see below under the section “scheduling”).

Operating System Part: Service Addressing

When a node is greatly expanded, it is desirable to distribute the requests onto the cluster such that, as far as possible, no individual platform is overloaded. The operating system part called “service addressing” assumes this task.

The operating system of each and every platform knows all functions offered and addressable in the entire cluster as “services”. A service can be offered on the platform on which it is requested and/or on one or more other platforms of the cluster. In order to avoid the communication overhead between the platforms, the requesting user is informed of the response address of the service of its own platform, if present. When, however, this platform is in an overload condition (also see the section “load control” below), then a different platform in the cluster is selected that also offers this service, insofar as such a different platform is in a lower load condition. This is possible because a current load image of the overall cluster is situated on every platform.
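The selection rule of the service addressing can be sketched as follows. This is only an illustrative sketch, not the operating system's implementation: the function name `resolve_service`, the numeric load levels (0 meaning no overload), and the dictionary layout of the load image are hypothetical, and choosing the least loaded remote platform is an assumption, since the description only requires a platform in a lower load condition.

```python
from typing import Dict, Optional, Set

# Hypothetical load image: platform id -> load level (0 = normal, higher = more loaded).
LoadImage = Dict[str, int]


def resolve_service(service: str,
                    own_platform: str,
                    offers: Dict[str, Set[str]],
                    load_image: LoadImage) -> Optional[str]:
    """Return the platform whose instance of `service` should answer the request."""
    providers = offers.get(service, set())
    if not providers:
        return None  # the service is not offered anywhere in the cluster

    # Remote providers, least loaded first (name as tie-breaker for determinism).
    others = sorted(providers - {own_platform}, key=lambda p: (load_image[p], p))

    if own_platform in providers:
        own_load = load_image[own_platform]
        # Prefer the local instance to avoid inter-platform communication,
        # unless it is overloaded and a less loaded provider exists.
        if own_load == 0 or not others or load_image[others[0]] >= own_load:
            return own_platform
        return others[0]

    # Not offered locally: assumption, pick the least loaded remote provider.
    return others[0]
```

Because every platform holds a current load image of the whole cluster, a lookup of this kind can be answered locally without querying other platforms.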

Operating System Part: Scheduling

For a minimal configuration of the cluster, the computer capacity of an individual processor must be carefully distributed onto the applications.

To that end, the operating system of each and every platform must support the processes that compete for calculating time on one and the same platform, particularly under high load, such that it can adhere to the real-time demands at the switching nodes. In order to enable this, the processes are combined into task groups.

FIG. 2 shows an example of the scheduling of the processes. A guaranteed minimum part of the processor time is defined as a “run time budget” (i.e., net calculating time) for each of the task groups by load control parameters. These task groups are therefore also referred to as “virtual CPUs” (VCPU). The processes are assigned to a FIFO waiting list per VCPU and priority, and wait there for allocation of the CPU.

The scheduling routine works by time slicing and within time slices. At the beginning of each and every time slot, it observes a time credit account per VCPU in that it compares the sum of all measured running times of the processes of a VCPU to the guaranteed time budget. The result of this comparison is a VCPU-individual time credit.

In the VCPU having the greatest time budget, that waiting process that is waiting at the first location of the waiting list having the highest priority is the first to receive calculating time (see FIG. 2).

When the process is suspended during this time slot, then the next process from the same VCPU (and, potentially, from the same waiting list) follows. When no other process is waiting from this VCPU, then the scheduling algorithm performs as it did at the beginning of the time slot. At the beginning of the next time slot, the scheduling algorithm is applied anew.
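The selection step at the beginning of a time slot might be sketched as below. The sketch rests on several assumptions that are not stated in the description: the time credit is computed as the guaranteed share of the elapsed time minus the measured run time, the VCPU chosen is the one with the largest such credit among those with waiting processes, and a lower priority number means a higher priority. The names `VCPU` and `pick_next` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class VCPU:
    name: str
    budget_share: float      # guaranteed fraction of processor time (from the load model)
    used_time: float = 0.0   # sum of measured running times of this VCPU's processes
    # One FIFO waiting list per priority; lower number = higher priority (assumption).
    queues: Dict[int, Deque[str]] = field(default_factory=dict)

    def enqueue(self, process: str, priority: int) -> None:
        self.queues.setdefault(priority, deque()).append(process)

    def has_waiting(self) -> bool:
        return any(self.queues.values())

    def time_credit(self, elapsed: float) -> float:
        # Time guaranteed so far minus time actually consumed.
        return self.budget_share * elapsed - self.used_time

    def pop_next(self) -> str:
        # First process of the non-empty waiting list with the highest priority.
        prio = min(p for p, q in self.queues.items() if q)
        return self.queues[prio].popleft()


def pick_next(vcpus: List[VCPU], elapsed: float) -> Optional[Tuple[str, str]]:
    """Choose (VCPU name, process) for the next allocation, or None if nothing waits."""
    ready = [v for v in vcpus if v.has_waiting()]
    if not ready:
        return None
    best = max(ready, key=lambda v: v.time_credit(elapsed))
    return best.name, best.pop_next()
```

Calling `pick_next` again after a process suspends yields the next process of the same VCPU as long as its credit remains the largest, which mirrors the behavior described for a time slot.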

Load Control

The load control SW cyclically compares the current load situation of its own platform to platform-individual thresholds and potentially identifies the processor as being in an overload condition. The overload condition is established for a processor when the offered, total load lies higher than the load thresholds that are established for this processor. This load information is distributed system-wide, so that the service addressing is in the position to “spare” overloaded platforms.

As soon as a processor is in the overload condition, the load control also determines, on the basis of the run time budget of the VCPUs, which of these VCPUs has exceeded its budget. These VCPUs are then referred to as overloaded. Suitable processes within these overloaded VCPUs are informed and requested to reduce their load offering to the processor.
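One cycle of the load control check could look roughly like the following sketch. The function names, the use of a single threshold per platform, and the representation of loads as fractions of processor time are assumptions made for illustration; the description does not fix these details.

```python
from typing import Dict, List


def platform_overloaded(offered_load: float, load_threshold: float) -> bool:
    """Overload is established when the offered total load exceeds the threshold."""
    return offered_load > load_threshold


def overloaded_vcpus(used_time: Dict[str, float],
                     budget_share: Dict[str, float],
                     interval: float) -> List[str]:
    """VCPUs whose measured run time in the interval exceeded their run time budget."""
    return sorted(name for name, used in used_time.items()
                  if used > budget_share[name] * interval)


def load_control_cycle(offered_load: float,
                       load_threshold: float,
                       used_time: Dict[str, float],
                       budget_share: Dict[str, float],
                       interval: float) -> List[str]:
    """Return the VCPUs whose processes should be asked to reduce their load offering."""
    if not platform_overloaded(offered_load, load_threshold):
        return []  # VCPU budgets are only examined once the platform is overloaded
    return overloaded_vcpus(used_time, budget_share, interval)
```

In this sketch a VCPU that exceeds its budget on a platform that is not overloaded is left alone, which follows the order of the two checks in the text.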

According to the above-presented MP operating system structure of the ATM switching node, both the load control as well as the process scheduling in the operating system work with a number of load control parameters in order to create the pre-conditions for an optimum real-time behavior. Since operating system and load control are generic (i.e., uniform) for all MP platforms, these data for all possible function configurations of a platform must be available on each MP. Moreover, they must be “activatable” interactively on the platform (e.g., during the configuration) in order to take the function configuration into account. Furthermore, the data must be adapted to the respective operating condition (e.g., recovery) of the platform in order to optimally respond to the various demands of these operating conditions.

Load Model Management

The illustrated demands and mechanisms such as scalability of the network node, distribution of the load offering onto the platforms, division of the computer capacity of a platform onto the various users, a process concept with static process attributes, as well as the load control are still not all of the mechanisms that are related to the load control parameters. Over and above this, there are also clock information, run time limitations for the internal protocol, considerations related to the basic load and further sub-budget divisions and thresholds that are not discussed here.

When one considers the quantity and complexity of the load control parameters as well as the request profile and function fabric in which they reside, it becomes clear that it is necessary to implement them in surveyable fashion and to have them controlled by their own software.

All parameters that influence the load control and the load distribution of the system are therefore stored in a catalog replicated on each platform. This creates a surveyability that is particularly important when the change at one of the parameters interferes with the values of other parameters in the design phase. At the run time, only a single management function (LM manager) has access to this catalog; this function also maintains the interfaces to the parameter users. It is also receptive to triggers that can request changes of the currently valid parameters via defined interfaces at the run time (for example, via interactive commands by the user). It bears the responsibility for informing all users affected by the changes. The structure of the parameter catalog is selected such that online changes of the valid parameters are meaningfully limited. An engineering of the complex influencing quantities is possible and can be implemented error-free only on the basis of this form of the management and the structuring.

The load management model structures the complex and extensive multitude of load control parameters in a way that, on the one hand, assures the surveyability for the purpose of engineerability, including error-free implementability. On the other hand, however, a flexibility is also enabled that creates the pre-conditions for a situation-suited adaptation (in the configuration and at the run time) of the currently valid parameter values for the purpose of optimum performance in view of load distribution and load control.

Interactively, only a single table can be selected and “activated” from the parameter catalog at one time, and thus no value from another table of the catalog can be employed (as long as this table is “active”). Changing operating conditions of the platform can in turn select only among pre-fabricated data sets and change between these.
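The restriction that exactly one prefabricated table is active at a time can be illustrated with a small sketch. The class name `ParameterCatalog`, its methods, and the flat name-to-value layout of a table are hypothetical simplifications; the read-only views merely stand in for the rule that individual values are not edited interactively.

```python
from types import MappingProxyType
from typing import Dict, Mapping, Optional


class ParameterCatalog:
    """Catalog of prefabricated parameter tables; at most one table is active at a time."""

    def __init__(self, tables: Dict[str, Dict[str, float]]) -> None:
        # Read-only copies: single values cannot be changed through the catalog.
        self._tables: Dict[str, Mapping[str, float]] = {
            name: MappingProxyType(dict(values)) for name, values in tables.items()
        }
        self._active: Optional[str] = None

    def activate(self, table_name: str) -> None:
        """Select a whole table; the previously active table is replaced as a unit."""
        if table_name not in self._tables:
            raise KeyError(f"no such table in catalog: {table_name}")
        self._active = table_name

    def value(self, parameter: str) -> float:
        """Read a parameter; values always come from the currently active table."""
        if self._active is None:
            raise RuntimeError("no table has been activated")
        return self._tables[self._active][parameter]
```

Since every read goes through the active table, values from two different tables can never be mixed, which is the consistency property the paragraph describes.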

FIG. 3 shows the relationships of the load model manager for various components within the system. There are three different types of users of the LM manager. A first user type (user 1) is only in communication with the LM manager. The recovery component is cited as an exemplary user of this type. A second user type (user i) is in communication both with the LM manager as well as with the load control; the scheduling component is to be cited as an exemplary user of this type. Finally, a third user type (user n) only receives values from the load control; the switching technology component is cited as an exemplary user of this type.

Operator Interface

Each individual platform of the cluster must be configured, for example upon initialization of the switching node. This means:

-   1) that the load type is allocated and loaded and
-   2) that a load model table is selected from the load model catalog replicated on all platforms.

The type of organization of the central load control parameters in the load model catalog is discussed in greater detail in the section “load model catalog”.

A load model table from the load model catalog is unambiguously allocated to a load type, whereby a plurality of tables can be present for one load type (for example, for different load expectations or operating conditions).

After the selection of the table, only the values recited in it, which represent a consistent set of parameters, are then valid until the interruption-free replacement of this table. This denotes a meaningful limitation that facilitates operation of the node for the user and provides him with the security of always placing parameters matched to one another into operation. The selection possibility from various load model tables provides him with the possibility of tuning the platforms.

The operator can establish suitable VCPU budgets at the various VCPUs dependent on the load anticipation. Since he can replace the currently valid load model table interruption-free during operation, the budgets can be adapted without difficulty to modified conditions (for example, due to a HW or SW upgrade in the node).

Values within a table can be changed by a SW patch that can be checked for consistency and supplied by the manufacturer. An operator command can then select the patched table free of operating interruptions (see FIG. 3), and the control software (load model management) takes care of informing the affected applications.

For coordination of the selection, see the section “control software” below and FIG. 4.

Load Model Catalog

The load model catalog represents the organization structure of those load model parameters that are centrally implemented. In contrast to this, process attributes, of course, are decentrally declared at every individual process; see the above section “process term.”

The load model catalog is composed of individual load model tables that can be interactively allocated to each platform (see the section “operator interface”).

Each load model table from the load model catalog contains, first, a plurality of load models that contain the budgets to be guaranteed for the VCPUs. Various load models are required in order to be able to take the different demands of the platform operating conditions (such as startup or normal operation) into account.

Second, the load model table also contains the clock information, overload limits and run time limitations for sequencing the internal protocol, the size of the load reserve, the basic load as well as a table that supports the allocation of the processes onto the VCPUs.
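The contents of a load model table listed above might be represented by a data structure along the following lines. All field names, the units (percent of processor time, milliseconds), and the consistency rule in `is_consistent` are assumptions chosen for illustration; the description names the kinds of parameters but not their encoding or the checks applied to them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LoadModel:
    """Budgets to be guaranteed for the VCPUs under one operating condition."""
    vcpu_budget_percent: Dict[str, int]


@dataclass(frozen=True)
class LoadModelTable:
    load_type: str                      # load type this table is allocated to
    load_models: Dict[str, LoadModel]   # keyed by operating condition, e.g. "normal"
    clock_interval_ms: int              # clock information
    overload_limit_percent: int         # overload limit
    protocol_runtime_limit_ms: int      # run time limitation for the internal protocol
    load_reserve_percent: int           # size of the load reserve
    basic_load_percent: int             # basic load
    process_to_vcpu: Dict[str, str]     # allocation of the processes onto the VCPUs

    def is_consistent(self) -> bool:
        """Hypothetical check: budgets fit into 100 percent and every process
        is mapped to a VCPU that has a budget entry in each load model."""
        vcpus_used = set(self.process_to_vcpu.values())
        for model in self.load_models.values():
            total = (sum(model.vcpu_budget_percent.values())
                     + self.load_reserve_percent + self.basic_load_percent)
            if total > 100 or not vcpus_used <= set(model.vcpu_budget_percent):
                return False
        return True
```

Integer percentages are used here only to keep the sum check exact; a catalog would then simply be a collection of such tables, several of which may share a load type.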

Control Software (Load Model Manager)

The control SW (load model management or management component) is an operating system-proximate process that sees to the readout of the correct parameter data upon initialization or platform status changes.

The load model (LM) manager enables the interruption-free replacement of the currently valid load model parameters. To that end, it makes interfaces for the events to be triggered available. Furthermore, it informs the SW affected by this event. The operating system as well as processes such as the load control belong to this SW. When, due to the replacement of the load model table, the budget is completely withdrawn from a VCPU, then, of course, the appertaining service must also be withdrawn.

Depending on the internal operating condition of the appertaining platform, one or the other load model is applied. When, for example, a recovery occurs during operation (also see FIG. 3 or 4), then recovery can turn to load management so that a switch is made from the normal operating load model to a specific recovery load model, so that the modified runtime demands of recovery are taken into account. Load model management independently informs the SW affected by this event such as, for example, scheduling and the load control.
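The switch between load models and the notification of the affected software can be sketched as follows. The class `LoadModelManager`, the callback signature, and the condition names "normal" and "recovery" are hypothetical; the sketch also assumes that a budget of zero (or a missing entry) means the budget has been completely withdrawn from a VCPU.

```python
from typing import Callable, Dict, List

Budgets = Dict[str, int]                      # VCPU name -> budget in percent
Listener = Callable[[str, Budgets], None]     # called with (condition, new budgets)


class LoadModelManager:
    """Holds the load models of the active table and switches between them."""

    def __init__(self, load_models: Dict[str, Budgets], initial: str = "normal") -> None:
        self._load_models = load_models
        self._current = initial
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        """Register affected software, e.g. scheduling or the load control."""
        self._listeners.append(listener)

    def current_budgets(self) -> Budgets:
        return dict(self._load_models[self._current])

    def on_operating_condition(self, condition: str) -> List[str]:
        """Switch to the load model for `condition`, inform all listeners, and
        return the VCPUs whose budget was completely withdrawn by the switch."""
        old = self._load_models[self._current]
        new = self._load_models[condition]   # only prefabricated models can be chosen
        self._current = condition
        for listener in self._listeners:
            listener(condition, dict(new))
        return sorted(v for v, b in old.items() if b > 0 and new.get(v, 0) == 0)
```

A recovery component would call `on_operating_condition("recovery")` and, once recovery has finished, `on_operating_condition("normal")`; the returned list indicates which services would have to be withdrawn along with their VCPU budgets.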

1. A computer system platform, comprising: a processor; a storage mechanism; software components that run on said processor, comprising a manager component, which is a load model manager, that uniformly administers parameters for a load distribution and load control of said computer system; said platform further comprising: a catalog stored in said storage mechanism via which said load model manager administers said parameters, said catalog comprising a plurality of tables, each of said tables respectively comprising a load model which is a complete, consistent set of parameters that influence said load distribution and said load control of said computer system.
2. The platform according to claim 1, wherein one of said tables can be defined as a currently valid table assisted by said load model manager.
3. The platform according to claim 2, wherein a selection accomplished by said definition of said table is dependent on a load type of a platform.
4. The platform according to claim 2, wherein a selection accomplished by said definition of said table is dependent on an operating condition of a platform.
5. The platform according to claim 2, further comprising an interface to said load model manager via which said definition of said table can be triggered.
6. The platform according to claim 2, wherein said load model manager comprises interfaces to said software components via which said load model manager informs said software components of currently valid load control parameters from said currently valid table.
7. The platform according to claim 2, wherein said load model manager comprises an interface via which a system manufacturer can modify content of a table during operations.
8. A method for uniformly administering parameters for a load distribution and load control of a computer system platform having a processor, comprising the steps of: storing a catalog in a storage mechanism of said platform; providing a plurality of tables in said catalog; providing a load module in each of said plurality of tables, said load module having a consistent set of parameters that influence said load distribution and said load control of said computer system; uniformly administering parameters for said load distribution and said load control by a load model manager software component that runs on said processor, utilizing said stored catalog.
9. The method according to claim 8, further comprising the step of defining one of said plurality of tables as a currently valid table assisted by said load model manager.
10. The method according to claim 9, wherein said step of defining a currently valid table is dependent on a load type of a platform.
11. The method according to claim 9, wherein said step of defining a currently valid table is dependent on an operating condition of a platform.
12. The method according to claim 9, further comprising the steps of: providing an interface to said load model manager; and triggering said step of defining said currently valid table via said interface.
13. The method according to claim 9, further comprising the steps of: providing interfaces between said load model manager and software components of said platform; informing, by said load model manager, said software components of currently valid load control parameters from said defined currently valid table.
14. The method according to claim 9, further comprising the steps of: providing an interface for said load model manager; modifying, by a system manufacturer via said interface, content of one of said tables during operations.